Génération aléatoire uniforme de mots de langages rationnels

Author: Denise Alain
Publication venue: Published by Elsevier B.V.
Publication date: 28/05/1996

Summary (translated from the French résumé): We give two algorithms for the uniform random generation of words, which apply to particular classes of rational languages. Their efficiency is measured in terms of logarithmic complexity, as a function of the length n of the generated words. The first algorithm is dedicated to languages whose generating series have a unique pole, possibly multiple; its time complexity is of order n log n, and the memory space used is in log n. The second algorithm is reserved for languages whose generating series have the following property: there exists a unique pole of smallest modulus, and this pole is simple. After a preprocessing stage in time polynomial in n, any word is drawn at random in linear average time and linear space.

Abstract: The problem of generating uniformly at random words of a given language has been the subject of extensive study in the last few years. An important part of that work is devoted to the generation of words of context-free languages (see, e.g., [6, 8, 9, 12]). For a given integer n > 0, the words of length n of any unambiguous context-free language can be generated uniformly at random by using algorithms derived from the general method which was introduced by Wilf [14, 15] and systematized by Flajolet et al. [7]. Clearly, this can be applied to the set of rational languages, which constitute an important special case of context-free languages.

Most authors use the uniform measure of complexity (see [1]) in order to compute the complexity of the generation algorithms. This measure is based on the following hypotheses: any simple arithmetic operation (addition, multiplication) has time cost O(1), and a constant amount of memory space is taken by any number. Thus, we know that words of any rational language can be generated by using an algorithm which, with respect to the uniform measure of complexity, runs in linear time (in terms of the length of the words) and constant space [9]. This measure is realistic only if there is a reasonable bound on the numbers involved in the operations. However, the classical random generation algorithms involve operations on numbers which grow exponentially in terms of the length of the words to be generated. Moreover, the programs which make use of these algorithms are generally used to generate very large words, for example for the purpose of studying the asymptotic behavior of some parameters. Therefore, the uniform measure does not reflect the real behavior of such programs. It turns out that the logarithmic measure of complexity is much more realistic: one assumes that the space taken by a number k is O(log k), and that any simple arithmetic operation can be done in time O(log k). It is with respect to this measure that we will evaluate the performance of algorithms in this paper.

Our goal is to design efficient algorithms (in terms of logarithmic complexity) to generate uniformly at random words from certain classes of rational languages. We consider rational languages defined by their minimal finite deterministic automata. When computing complexity, neither the size of the automaton nor the cardinality of the alphabet is taken into account.

In Section 2 we present some background on rational languages and their generating series. We describe briefly the classical method for generating words of such languages and we study its logarithmic complexity. We show that it is at best quadratic for most languages. This is due mainly to computations on numbers which grow exponentially with the length of the words to be generated. In order to improve significantly the efficiency of the algorithms, we must avoid handling large numbers, or at least decrease substantially the frequency of computations on such numbers. An alternative, briefly discussed in [7] and [12], is to compute with floating point numbers instead of integers. In this case, the logarithmic complexity is linear in time. However, using floating point numbers leads inevitably to approximations which prevent the exact uniformity of the generation.

In Sections 3 and 4 we show that, in some cases, we can avoid computations on large numbers entirely or almost entirely, while keeping the exact uniformity of the generation. We determine two classes of rational languages for which this is the case.

Section 3 concerns languages whose associated generating series have a unique singularity. We present a simple version of the classical algorithm, which totally avoids handling large numbers. The logarithmic complexity of the method is O(n log n) in time and O(log n) in memory space.

Section 4 focuses on languages whose associated generating series have the following property: there exists a unique singularity of minimum modulus, and this singularity is simple. For such languages we give a probabilistic version of the classical algorithm which generates words randomly while avoiding most computations on large numbers. This method needs a preprocessing stage, which can be done in polynomial time and linear space in terms of the length n of the words. Following preprocessing, any word of length n can be generated in average linear time and space.
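The classical counting-based method mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function names `count_table` and `random_word`, the dictionary encoding of the automaton, and the example language (words over {a, b} without two consecutive b's) are all assumptions made for illustration. The sketch relies on Python's arbitrary-precision integers, so the counts it stores are exactly the exponentially growing numbers that the logarithmic complexity measure accounts for.

```python
import random

def count_table(delta, finals, n):
    """counts[k][s] = number of words of length k that lead from state s
    to a final state. delta maps each state to a {letter: target} dict."""
    states = list(delta)
    counts = [{s: (1 if s in finals else 0) for s in states}]
    for _ in range(n):
        prev = counts[-1]
        counts.append({s: sum(prev[t] for t in delta[s].values()) for s in states})
    return counts

def random_word(delta, start, finals, n, rng=random):
    """Draw a word of length n accepted by the automaton, uniformly at random."""
    counts = count_table(delta, finals, n)
    if counts[n][start] == 0:
        raise ValueError("the language has no word of this length")
    word, state = [], start
    for k in range(n, 0, -1):
        # Choose each letter with probability
        # counts[k-1][target] / counts[k][state], using exact integers.
        r = rng.randrange(counts[k][state])
        for letter, target in delta[state].items():
            c = counts[k - 1][target]
            if r < c:
                word.append(letter)
                state = target
                break
            r -= c
    return "".join(word)

# Hypothetical example automaton: words over {a, b} with no factor "bb".
# State 0: last letter is not b; state 1: last letter is b.
NO_BB = {0: {"a": 0, "b": 1}, 1: {"a": 0}}
FINALS = {0, 1}
```

Calling `random_word(NO_BB, 0, FINALS, 10)` returns one of the accepted words of length 10, each with the same probability. The table built by `count_table` holds n + 1 rows of integers whose size grows linearly in n bits, which is one way to see why the abstract describes the classical method as at best quadratic under the logarithmic measure.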

Elsevier - Publisher Connector

Tree decomposition and parameterized algorithms for RNA structure-sequence alignment including tertiary interactions and pseudoknots

Author: Barth Dominique
Denise Alain
Ponty Yann
Rinaudo Philippe
Publication date: 17/06/2012

We present a general setting for structure-sequence comparison in a large class of RNA structures that unifies and generalizes a number of recent works on specific families of structures. Our approach is based on tree decomposition of structures and gives rise to a general parameterized algorithm, where the exponential part of the complexity depends on the family of structures. For each of the previously studied families, our algorithm has the same complexity as the specific algorithm that had been given before. Comment: (2012

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Polytechnique

HAL UVSQ

HAL-Rennes 1

Flexible RNA design under structure and sequence constraints using formal languages

Author: Denise Alain
Ponty Yann
Vialette Stéphane
Waldispühl Jérôme
Zhang Yi
Zhou Yu
Publication date: 01/08/2013

The problem of RNA secondary structure design (also called inverse folding) is the following: given a target secondary structure, one aims to create a sequence that folds into, or is compatible with, that structure. In several practical applications in biology, additional constraints must be taken into account, such as the presence/absence of regulatory motifs, either at a specific location or anywhere in the sequence. In this study, we investigate the design of RNA sequences from their targeted secondary structure, given these additional sequence constraints. To this purpose, we develop a general framework based on concepts of language theory, namely context-free grammars and finite automata. We efficiently combine a comprehensive set of constraints into a unifying context-free grammar of moderate size. From there, we use generic algorithms to perform a (weighted) random generation, or an exhaustive enumeration, of candidate sequences. The resulting method, whose complexity scales linearly with the length of the RNA, was implemented as a standalone program. The resulting software was embedded into a publicly available dedicated web server. The applicability of the method was demonstrated on a concrete case study dedicated to Exon Splicing Enhancers, in which our approach was successfully used in the design of in vitro experiments. Comment: ACM BCB 2013 - ACM Conference on Bioinformatics, Computational Biology and Biomedical Informatics (2013

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL-Polytechnique

HAL - UPEC / UPEM

Average complexity of the Jiang-Wang-Zhang pairwise tree alignment algorithm and of a RNA secondary structure alignment algorithm

Author: Denise Alain
Dulucq Serge
Herrbach Claire
Publication venue: 'Elsevier BV'
Publication date: 01/01/2010

We prove that the average complexity of the pairwise ordered tree alignment algorithm of Jiang, Wang and Zhang is in O(nm), where n and m stand for the sizes of the two trees, respectively. We show that the same result holds for the average complexity of pairwise comparison of RNA secondary structures, using a set of biologically relevant operations.

HAL-CentraleSupelec

CiteSeerX

Elsevier - Publisher Connector

INRIA a CCSD electronic archive server

HAL-Polytechnique

HAL-Rennes 1

VARNA: Interactive drawing and editing of the RNA secondary structure.

Author: Darty Kévin
Denise Alain
Ponty Yann
Publication venue: 'Oxford University Press (OUP)'
Publication date: 27/04/2009

DESCRIPTION: VARNA is a tool for the automated drawing, visualization and annotation of the secondary structure of RNA, designed as companion software for web servers and databases. FEATURES: VARNA implements four drawing algorithms, supports input/output using the classic formats dbn, ct, bpseq and RNAML, and exports the drawing in five picture formats, either pixel-based (JPEG, PNG) or vector-based (SVG, EPS and XFIG). It also allows manual modification and structural annotation of the resulting drawing using either an interactive point-and-click approach, within a web server or through command-line arguments. AVAILABILITY: VARNA is free software, released under the terms of the GPLv3.0 license and available at http://varna.lri.fr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

PubMed Central

HAL-Polytechnique

HAL-Rennes 1

Generating functions for generating trees

Author: Banderier Cyril
Bousquet-Melou Mireille
Denise Alain
Flajolet Philippe
Gardy Daniele
Gouyou-Beauchamps Dominique
Publication venue: 'Elsevier BV'
Publication date: 01/01/2002

Certain families of combinatorial objects admit recursive descriptions in terms of generating trees: each node of the tree corresponds to an object, and the branch leading to the node encodes the choices made in the construction of the object. Generating trees lead to a fast computation of enumeration sequences (sometimes, to explicit formulae as well) and provide efficient random generation algorithms. We investigate the links between the structural properties of the rewriting rules defining such trees and the rationality, algebraicity, or transcendence of the corresponding generating function. Comment: This article corresponds, up to minor typo corrections, to the article submitted to Discrete Mathematics (Elsevier) in Nov. 1999, and published in its vol. 246(1-3), March 2002, pp. 29-5
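As a rough illustration of how a generating tree yields an enumeration sequence, the following Python sketch counts the nodes at each level of a tree defined by a root label and a rewriting rule. The helper names `level_counts` and `catalan_rule` are hypothetical, and the example rule, root (2) with (k) rewritten to (2)(3)...(k+1), is the standard Catalan generating tree, chosen here as an assumption rather than taken from the abstract.

```python
from collections import Counter

def level_counts(root_label, rule, depth):
    """Return the number of nodes at levels 0..depth of a generating tree.

    Only the multiset of labels on the current level is kept, so the tree
    itself is never built."""
    level = Counter({root_label: 1})
    sizes = [1]
    for _ in range(depth):
        nxt = Counter()
        for label, multiplicity in level.items():
            # Each node with this label produces the children given by the rule.
            for child in rule(label):
                nxt[child] += multiplicity
        level = nxt
        sizes.append(sum(level.values()))
    return sizes

def catalan_rule(k):
    """Rewriting rule (k) -> (2)(3)...(k+1) of the Catalan generating tree."""
    return range(2, k + 2)

print(level_counts(2, catalan_rule, 5))  # [1, 2, 5, 14, 42, 132]
```

Tracking label multiplicities instead of individual nodes keeps the work per level proportional to the number of distinct labels, which is the kind of fast computation of enumeration sequences the abstract refers to.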

arXiv.org e-Print Archive

HAL-CentraleSupelec

Elsevier - Publisher Connector

Crossref

INRIA a CCSD electronic archive server

Uniform Random Sampling of Traces in Very Large Models

Author: Collaboration the RaST
Denise Alain
Gaudel Marie-Claude
Gouraud Sandrine-Dominique
Lasseigne Richard
Peyronnet Sylvain
Publication date: 01/01/2006

This paper presents some first results on how to perform uniform random walks (where every trace has the same probability of occurring) in very large models. The models considered here are described in a succinct way as a set of communicating reactive modules. The method relies upon techniques for counting and drawing uniformly at random words in regular languages. Each module is considered as an automaton defining such a language. It is shown how it is possible to combine local uniform drawings of traces, and to obtain some global uniform random sampling, without construction of the global model.

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

Hal-Diderot

A new dichotomic algorithm for the uniform random generation of words in regular languages (journal version)

Author: Denise Alain
Gaudel Marie-Claude
Oudinet Johan
Publication venue: 'Elsevier BV'
Publication date: 02/09/2013

We present a new algorithm for generating uniformly at random words of any regular language $\mathcal{L}$. When using floating point arithmetic, its bit-complexity is $\mathcal{O}(q \log^2 n)$ in space and $\mathcal{O}(q n \log^2 n)$ in time, where $n$ stands for the length of the word, and $q$ stands for the number of states of a finite deterministic automaton of $\mathcal{L}$. We implemented the algorithm and compared its behavior to the state-of-the-art algorithms, on a set of large automata from the VLTS benchmark suite. Both theoretical and experimental results show that our algorithm offers an excellent compromise in terms of space and time requirements, compared to the known best alternatives. In particular, it is the only method that can generate long paths in large automata.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Polytechnique

HAL-Rennes 1

Counting RNA pseudoknotted structures

Author: Denise Alain
Regnier Mireille
Saule Cédric
Steyaert Jean-Marc
Publication venue: 'Mary Ann Liebert Inc'
Publication date: 01/10/2011

In 2004, Condon and coauthors gave a hierarchical classification of exact RNA structure prediction algorithms according to the generality of structure classes that they handle. We complete this classification by adding two recent prediction algorithms. More importantly, we precisely quantify the hierarchy by giving closed or asymptotic formulas for the theoretical number of structures of given size n in all the classes but one. This allows one to assess the tradeoff between the expressiveness and the computational complexity of RNA structure prediction algorithms.

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

PubMed Central

HAL-Polytechnique

HAL-Rennes 1

Homology modeling of complex structural RNAs

Author: Barba Matthieu
Denise Alain
Ponty Yann
Rinaudo Philippe
Wang Wei
Publication venue: HAL CCSD
Publication date: 01/06/2016

Aligning macromolecules such as proteins, DNAs and RNAs in order to reveal, or conversely exploit, their functional homology is a classic challenge in bioinformatics, with far-reaching applications in structure modelling and genome annotation. In the specific context of complex RNAs, featuring pseudoknots, multiple interactions and non-canonical base pairs, multiple algorithmic solutions and tools have been proposed for the structure/sequence alignment problem. However, such tools are seldom used in practice, due in part to their extreme computational demands, and because of their inability to support general types of structures. Recently, a general parameterized algorithm based on tree decomposition of the query structure has been designed by Rinaudo et al. We present an implementation of the algorithm within a tool named LiCoRNA. We compare it against state-of-the-art algorithms. We show that it both gracefully specializes into a practical algorithm for simple classes of pseudoknots, and offers a general solution for complex pseudoknots, which are explicitly out of reach of competing software.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-CEA

HAL-Polytechnique

HAL-Rennes 1